Gut microbiota and fecal short chain fatty acids differ with adiposity and country of origin: The METS-Microbiome Study

The relationship between gut microbiota, short chain fatty acid (SCFA) metabolism, and obesity is still not well understood. Here we investigated these associations in a large (n=1904) African origin cohort from Ghana, South Africa, Jamaica, Seychelles, and the US. Fecal microbiota diversity and SCFA concentration were greatest in Ghanaians, and lowest in the US population, representing the lowest and highest end of the epidemiologic transition spectrum, respectively. Obesity was significantly associated with a reduction in SCFA concentration, microbial diversity and SCFA synthesizing bacteria. Country of origin could be accurately predicted from the fecal microbiota (AUC=0.97), while the predictive accuracy for obesity was inversely correlated to the epidemiological transition, being greatest in Ghana (AUC = 0.57). The findings suggest that the microbiota differences between obesity and non-obesity may be larger in low-to-middle-income countries compared to high-income countries. Further investigation is needed to determine the factors driving this association.


Introduction
Obesity, which affects more than 600 million adults worldwide ("Obesity and Overweight" n.d.) and over a third of Americans (Hales et al. 2020), and accounts for over 60% of deaths related to high body mass index (BMI) (Tseng and Wu 2019), remains a global health epidemic that continues to worsen at an alarming rate. A major driver of obesity is the adoption of a western lifestyle, which is characterized by excessive consumption of ultra-processed foods. Obesity is a major risk factor for type 2 diabetes, and according to the most recent National Diabetes Statistics Report almost 13% of the adult US population now has diabetes. Not only do 49.6% of adult African Americans present with obesity, but over 17% of them now have diabetes, and they are 1.5 times as likely to present with type 2 diabetes compared to whites ("National Diabetes Statistics Report" 2022). Populations of African origin outside of the US are experiencing similar fates, as the prevalence of obesity among adults living in Sub-Saharan Africa is greater than 13%, and higher than the global obesity prevalence for adults (Agyemang et al. 2016). This has been accompanied by dramatic increases in the prevalence of non-communicable diseases such as type 2 diabetes and hypertension among people of African origin (Roth et al. 2020; Gouda et al. 2019). Therefore, disrupting the rapidly expanding obesity epidemic, particularly among African origin populations, is critical to controlling the cardiometabolic disorder epidemic (Geng et al. 2022). However, successfully managing and treating obesity and its comorbidities, and specifically maintaining weight loss long-term, is particularly challenging due to an incomplete understanding of the heterogeneous and complex etiopathology, as well as additional challenges facing populations experiencing rapid urbanization (Nordmo, Danielsen, and Nordmo 2020; Geng et al. 2022; Barone et al. 2022).
The epidemiologic transition is a model able to capture these dietary shifts and rural-to-urban movements, and is characterized by diets that are high in ultra-processed foods with a significant loss in fiber, as evidenced in the US, where less than 50% of the population meet dietary fiber recommendations (Dahl and Stewart 2015).
Well-defined cohorts from the Modeling the Epidemiologic Transition Study (METS) offer a unique opportunity to examine these issues since they are more representative of most of the world's population. METS has longitudinally followed an international cohort of approximately 2,500 African origin adults spanning the epidemiologic transition from Ghana, South Africa, Jamaica, Seychelles, and the US since 2010 to investigate differences in health outcomes utilizing the framework of the epidemiologic transition. Pioneering microbiome studies from the METS cohorts reveal that cardiometabolic risk factors including obesity are significantly associated with reduced microbial diversity, and the enrichment of specific taxa and predicted functional traits in a geographic-specific manner (Dugas, Bernabé, et al. 2018; Fei et al. 2019). These studies yielded valuable descriptions of the connections between gut microbiota ecology and disease, particularly obesity, and pioneered microbiome studies of populations of African origin at different stages of the ongoing nutritional epidemiologic transition. However, they used small sample sizes (N=100 to N=655) and did not include all the countries in the METS cohort. Thus, uncertainties remain as to the precise interpretation of the microbiome-obesity associations, which hampers further progress towards diagnostic and clinical applications.
Our new study, METS-Microbiome, investigated associations between gut microbiota composition and functional patterns, concentrations of fecal SCFAs, and obesity in a large (N = 1,904) adult population cohort of African origin, comprising participants from Ghana, South Africa, Jamaica, Seychelles, and the US and spanning the epidemiologic transition (Dugas, Lie, et al. 2018; Luke et al. 2011). The central hypothesis is that shifts towards the highest end of the epidemiologic transition spectrum are associated with alterations in microbiota diversity and community composition, reductions in levels of fecal SCFAs, and obesity.

Results
Obesity differs significantly across the epidemiological transition. From 2018-2019, the METS-Microbiome study recruited 2,085 participants (~60% women) ages 35-55 years old from five different sites (Ghana, South Africa, Jamaica, Seychelles, and US). Of these participants, 1,249 have been followed on a yearly basis since 2010 under the parent METS study. Data from 1,867 participants with complete data sets were used in this analysis. Overall mean age was 42.5 ± 8.0 years (Table 1). Mean fasted blood glucose was 105.2 ± 39.4 mg/dL, mean systolic blood pressure was 123.4 ± 18.1 mm Hg and mean diastolic blood pressure was 77.2 ± 13.1 mm Hg (Table 1). When compared to the high-income countries (Jamaica, Seychelles, and US), both women and men from the lower- and middle-income countries (Ghana and South Africa) had significantly lower BMI, fasted blood glucose and blood pressure (systolic and diastolic). Mean BMI was lowest in the South African men (22.3 kg/m2 ± 4.1) and highest in US women (36.3 kg/m2 ± 8.8). When compared to the US, all sites had significantly lower prevalence of obesity (p<0.001 for all sites except for Seychelles: p=0.02). Prevalence of hypertension was lowest in Ghanaian men (33.1%) and highest in US men (72.7%). Prevalence of diabetes was lowest in South African women and men (3.5% for women and men) and highest for Seychellois men (22.8%). When compared to the US, prevalence of hypertension and diabetes was significantly lower in countries at the lower end of the spectrum of HDI (i.e., Ghana and South Africa) (p<0.001).
Microbial community composition and predicted metabolic potential differ significantly between countries and correlate with obesity. Following the removal of samples that had fewer than 6,000 reads and features with fewer than ten reads in the entire dataset, a total of 433,364,873 16S rRNA gene sequences were generated from the 1,873 fecal samples, which were clustered into 13,254 ASVs. Country of origin describes most of the variation in microbial diversity and composition, with significant differences in both alpha and beta diversity. Although there were major variations in alpha diversity between countries and a large degree of inter-individual variation within countries, Ghana showed significantly greater diversity for all the alpha diversity metrics (Observed ASVs, Shannon Diversity and Faith's phylogenetic diversity) when compared to all other countries. The Seychelles and US had the lowest alpha diversity (Fig. 1). The stool microbiota alpha diversity of non-obese individuals was significantly greater when compared with that of obese individuals (Fig. 1). Beta diversity was also significantly different between countries (Fig. 1). Next, we compared fecal microbiota diversity of obese individuals with that of their non-obese counterparts within each country independently. Greater alpha diversity was detected in non-obese subjects in the Ghanaian (Observed ASVs, Faith PD; p<0.05) and South African cohorts (Observed ASVs; p<0.05) only (Supplementary Table 1). Similarly, significant differences in beta diversity between obese and non-obese microbiota were observed in the Ghana (Unweighted UniFrac; p<0.05), South Africa (Unweighted UniFrac; p<0.05) and US (Weighted UniFrac; p<0.05) data sets (Supplementary Tables 2 & 3).
These results suggest that the beta diversity differences observed in the Ghanaian and South African participants may partly be due to the presence of more abundant fecal microbiota taxa in the fecal samples whereas among the US participants, the differences may be related to the abundance of rare taxa. Collectively, these observations suggest that country is a major driver of the variance in gut microbiota diversity and composition among participants with or without obesity with marked contributions from Ghana and South Africa and modest contribution from the US in the overall cohort.
We also examined whether country of origin or obesity relates to the presence of specific microbial genera frequently used to stratify humans into enterotypes (Arumugam et al. 2011). As expected, large differences in enterotype between the countries were observed. The Prevotella enterotype (P-type) was enriched on the African continent, with 81% and 62% in Ghanaians and South Africans respectively, while the Bacteroides enterotype (B-type) was dominant in the US (75%) and Jamaican (68%) cohorts, and individuals from Seychelles showed comparable proportions of both enterotypes. Further, obese individuals displayed a greater abundance of B-type, whereas a higher proportion of the P-type was associated with the non-obese group (Supplementary Table 4). Consistent with this observation, the B-type correlated with higher BMI (p=0.004) than the P-type. Significantly greater diversity and increased levels of total SCFA were observed in participants in the P-type (Supplementary Table 4). The relative abundance of shared and unique features between the different countries, illustrated by the Venn diagram, showed that Ghana carries the largest proportion of unique taxa among the countries, and the US the lowest (Fig. 1).
Microbial taxonomic features predict obesity overall and within each country. Using supervised Random Forest machine learning, the predictive capacity of the gut microbiota features in stratifying individuals by country of origin, sex, or metabolic phenotype was assessed. The predictive performance of the model was calculated by area under the receiver operating characteristic curve (AUC) analysis, which showed a high accuracy for country of origin (AUC = 0.97), and a comparatively lower level of predictive accuracy for obese state (AUC = 0.65) (Fig. 3). Sex was predicted with AUC = 0.75, diabetes status with AUC = 0.63, hypertensive status with AUC = 0.65 and glucose status with AUC = 0.66. Random Forest analysis was also used to identify the top 30 microbial taxonomic features that differentiate between countries and obese states. Similar to the ANCOM-BC results, Prevotella and Streptococcus were at a greater proportion in the microbiota of Ghanaian and non-obese individuals, whereas Mogibacterium was at a greater proportion in the South African cohort. A greater proportion of Megasphaera was associated with the Jamaican cohort, while a greater proportion of Ruminococcaceae was observed in the American microbiota. Weissella, which was identified as having a significantly greater proportion in the Ghanaian cohort using ANCOM-BC, was observed to be a discriminatory feature for the Seychelles microbiota using Random Forest (Supplementary Fig. 2).
Similarly, the predictive capacity of the gut microbiota features in stratifying individuals by obese state was assessed at each of the five study sites. The predictive performance of the model was calculated by AUC analysis, which showed a moderate accuracy for obese state for all sites, namely, Ghana (AUC
Predicted genetic metabolic potential differs by country and obesity status. The predicted potential microbial functional traits resulting from the compositional differences in microbial taxa between countries and obese state were assessed. PICRUSt2 predicted a total of 372 MetaCyc functional pathways. ANCOM-BC analysis adjusted for sex, age and BMI identified 67 pathways (p < 0.05, false discovery rate (FDR)-corrected; LFC > 1.4) that accounted for discriminative features between the other 4 countries and the US (Supplementary Fig. 4). In comparison with the US, MetaCyc pathways differentially increased in Ghana and Jamaica include methylgallate degradation, norspermidine biosynthesis (PWY-6562), gallate degradation I pathway, gallate degradation II pathway, histamine degradation (PWY-6185), and toluene degradation III (via p-cresol) (PWY-5181). South African samples had a greater proportion of L-glutamate degradation VIII (to propanoate) (PWY-5088), isopropanol biosynthesis (PWY-6876), creatinine degradation (PWY-4722), adenosylcobalamin biosynthesis (anaerobic) (PWY-5507), and respiration I (cytochrome c) (PWY-3781). MetaCyc pathways linked to norspermidine biosynthesis (PWY-6562) and mycothiol biosynthesis (PWY1G-0) were at a greater proportion in the Seychelles samples, whereas reductive acetyl coenzyme A (CODH-PWY) and chorismate biosynthesis II (PWY-6165) were depleted in the US samples. ANCOM-BC analysis adjusted for site, sex and age identified 24 predicted pathways that differentiated between obese and non-obese individuals (Supplementary Fig. 4).
Notably, the microbiota of non-obese individuals had a greater proportion of predicted pathways including the TCA cycle, amino acid metabolism (P162-PWY, PWY-5154, PWY-5345), ubiquinol biosynthesis-related pathways (PWY-5855, PWY-5856, PWY-5857, PWY-6708, UBISYN-PWY), cell structure biosynthesis and nucleic acid processing (PWY0-845, PYRIDOXSYN-PWY).
Several predicted gut microbial genes involved in LPS biosynthesis that were differentially enriched among the countries (p < 0.05, false discovery rate (FDR)-corrected) were identified. In particular, specific LPS genes (K02560, K12973, K02849, K12979, K12975, K12974) were significantly enriched in Ghana, South Africa, Jamaica, and Seychelles when compared with the US. LPS genes including K12981, K12976, K09953, and K03280 were significantly increased in Seychelles samples in comparison with US samples and also significantly increased in the US cohorts in comparison with participants from Ghana, South Africa, and Seychelles. US samples had a greater proportion of the following genes (K15669, K09778, K07264, K03273, K03271) in comparison with the other 4 countries (Supplementary Fig. 6). Non-obese individuals had a greater abundance of predicted genes encoding LPS biosynthesis (K02841, K02843, K03271, K03273, K19353, K02850), whereas only 1 LPS gene (K02841) was differentially elevated in the non-obese group (Supplementary Fig. 6). All analyses were adjusted for country, sex, BMI and age (FDR-corrected p < 0.05).
Microbial community composition and taxonomy correlate with observed fecal SCFA concentrations. All countries had significantly higher weight-adjusted fecal total SCFA levels when compared to the US participants (p<0.001), with Ghanaians having the highest weight-adjusted fecal total SCFA levels (Supplementary Table 5). When compared to their obese counterparts, non-obese participants had significantly higher weight-adjusted fecal total and individual SCFA levels (Supplementary Table 6). Total SCFA levels displayed a weak but significant positive correlation with Shannon diversity (r = 0.074). A similar trend was observed for the individual SCFAs, namely valerate (r = 0.19), butyrate (r = 0.12), propionate (r = 0.073) and acetate (r = 0.058) (Fig. 4). Observed ASVs were not significantly correlated with total SCFAs (p>0.05). Levels of acetate, butyrate and propionate exhibited strong significant correlations with total SCFA, whereas valerate levels were significantly negatively correlated (r = -0.09) with total SCFAs. Next, we assessed whether levels of total SCFAs could be predicted by a mixed model. Country explained 45.7% of the variation in SCFAs. No significant effect was explained by either obesity or Shannon diversity.
To explore the connection between SCFAs and gut microbiota, Spearman correlations between taxa that were proportionally significantly different between countries and concentrations of SCFAs were determined. Valerate negatively correlated with the proportion of Clostridium, Prevotella, Faecalibacterium, Roseburia and Streptococcus, which were all positively correlated with acetate, propionate, and butyrate. Similarly, the proportions of Christensenellaceae, Eubacterium, and UCG 002 (Ruminococcaceae) were significantly positively associated with valerate, and negatively correlated with acetate, propionate, and butyrate. In addition, only a single ASV annotated to Ruminococcus was observed to be positively associated with all 4 SCFAs (Fig. 5). Similarly, Spearman's rank correlation coefficients were calculated between the differentially abundant ASVs identified between the obese and non-obese groups and concentrations of SCFAs. Broadly, the proportions of most ASVs were significantly positively associated with acetate in comparison with the other 3 SCFAs. Consistent with the correlations mentioned above, valerate negatively correlated with most ASVs that were found to be positively correlated with the three major SCFAs, acetate, propionate, and butyrate, and vice versa. The relative proportions of ASVs belonging to Allisonella, Erysipelotrichaceae and Libanicoccus positively correlated with acetate, propionate, and butyrate, whereas significantly negative relationships were observed between Parabacteroides and Bacteroides abundances and the aforementioned SCFAs. Valerate showed significantly positive associations with Oscillospirales and Ruminococcaceae abundances and significantly negative correlations with Lachnospira and Eggerthella abundances (Fig. 5).

Discussion
By leveraging a well-characterized, large population-based cohort of African origin residing in geographically distinct regions of Ghana, South Africa, Jamaica, Seychelles, and the US, we examined the relationships between gut microbiota, SCFAs and adiposity. Our data revealed profound variations in gut microbiota, which are reflected in the significant changes in community composition, structure, and predicted functional pathways as a function of population obesity and geography, despite their shared ancestral background. Our data further revealed an inverse relation between fecal SCFA concentrations, microbial diversity, and obesity; importantly, the utility of the microbiota in predicting whether an individual was lean or obese was inversely correlated with the income level of the country of origin. Overall, our findings are important for understanding the complex relationships between the gut microbiota, population lifestyle and the development of obesity, which may set the stage for defining the mechanisms through which the microbiome may shape health outcomes in populations of African origin.
As reported previously, our data showed that geographic origin can modulate the composition of the gut microbiota. Our findings were also consistent with our previous METS studies (Fei et
We also inferred the metabolic capacity of the gut microbiota associated with the different countries. Several metabolic pathways linked to carrier, cofactor and vitamin biosynthesis, biosynthesis/degradation of amines, amino acids, and aromatic xenobiotics, and the tricarboxylic acid (TCA) cycle were differentially enriched between the different countries compared with the US. These pathways are involved in biochemical reactions that regulate several processes including energy metabolism, inflammation, epigenetic processes, and oxidative stress. Participants from Ghana and Jamaica were enriched for gallate degradation, which can result in phenolic catechin metabolites that are thought to alleviate obesity-related pathologies (Marchesi et al. 2016; Liu et al. 2021). Additionally, metabolism of glutamate, which can be fermented to butyrate and propionate, was enriched in South Africans and Ghanaians compared to the US. In the Seychelles, actinobacterial mycothiol biosynthesis, which is involved in antioxidant activity and the removal of toxic compounds from cells, was upregulated (Newton, Buchmeier, and Fahey 2008). We further identified an increase in SCFA synthesis pathways, e.g. the acetyl coenzyme A pathway, threonine biosynthesis, and leucine degradation, in the microbiomes of all four countries compared to the US. Further studies using shotgun metagenomic sequencing are required to evaluate the potential causal relations of these gut microbial functions with health outcomes. We also detected several butyrate-producing ASVs, including Eubacterium, Alistipes, Clostridium and Odoribacter, to be proportionally enriched in individuals who were non-obese.
We observed that obese individuals presented a greater abundance of Lachnospira, which does produce SCFAs, a finding also consistent with our prior study in the same population. One explanation may be differences in lifestyle factors, including medication, activity, and pollutant exposure, which could also impact intestinal absorption in western countries. We note that fecal SCFA concentrations are not a direct measure of intestinal SCFA production, but rather reflect the net result of the difference between production and absorption (Canfora, Jocken, and Blaak 2015). Studies using stable isotopes to measure SCFA dynamics would improve interpretation of this dichotomy.
While SCFAs associate with the obese phenotype, another mechanism underpinning obesity is metabolic endotoxemia. An increase in Proteobacteria, which often accompanies a high fat/high sugar diet, is often associated with an increase in circulating lipopolysaccharide (LPS) and H2S, which provoke low-grade inflammation, increased intestinal permeability, and clock gene disruption in the liver, which associate
In obese individuals, as well as SCFA metabolism, we also detected marked depletion in pathways involved in cell structure, vitamin B6, NAD, and amino acid biosynthesis. This suggests that pathways important for growth and energy homeostasis are disrupted in individuals with obesity. We also noted an enrichment of the formaldehyde assimilation I (serine pathway) pathway. Endogenous formaldehyde produced at sufficient levels has carcinogenic properties and detrimental effects on genome stability. To counteract this reactive molecule, organisms have evolved a detoxification system that converts formaldehyde to formate, a less reactive molecule that can be used for nucleotide biosynthesis (Reingruber and Pontel 2018; N. H. Chen et al. 2016). Thus, we may infer that the pattern of increased formaldehyde assimilation pathway in our data might result from a defect or diminished capacity of the formaldehyde detoxification pathway, an assumption which requires further verification. A study reported increases in the abundance of the formaldehyde assimilation pathway in a depressed group when compared with non-depressed controls (S.-Y. Kim et al. 2022). We are the first to show that the gut of obese participants is enriched in the formaldehyde assimilation pathway. Although we do not understand the mechanistic details, it is known that toxic formaldehyde is generated along with reactive oxygen species during inflammatory processes (N. H. Chen et al. 2016).
Thus, an increased capacity for the formaldehyde pathway may indicate a microbiome-induced increase in reactive oxygen species in the gut of obese individuals. Indeed, prior work has identified induction of oxygen stress by microbial perturbations as one of the mechanisms by which the microbiome can promote weight gain and insulin resistance (J. Qin et al. 2012). The specific alterations of the gut microbiota and the associated predicted functionality may constitute a potential avenue for the development of microbiome-based therapeutics to treat obesity and/or to promote and sustain weight loss.
Study strengths and limitations. Our study has several strengths, including a large sample size, a diverse population along an epidemiological transition gradient with a comprehensive dataset that allowed the exclusion of the potential effects of origin as well as control of potential interpersonal covariates, and the use of validated and standard tools for data collection. We acknowledge some limitations as well. First, the cross-sectional nature of our study design means that it cannot establish temporality or identify mechanisms by which the gut microbiome may causally influence the observed associations. In that regard, we expect that prospective data from the METS cohort study will provide the basis to assess the longitudinal association between gut microbiota composition, metabolites, and obesity, and we have an ongoing study exploring the potential correlations longitudinally. Second, the use of 16S rRNA sequencing for inferences on microbial functional ecology inherently limits conclusions on species- and strain-level functionality due to its low resolution. Nevertheless, our results provide insight into the relationship between obesity, gut microbiota, and metabolic pathways in individuals of African origin across different geographies. They motivate large-scale studies using multi-omic approaches with deeper taxonomic and functional resolution, and animal transplantation studies, to investigate potentially novel microbial strains and to explore the clinical relevance of the observed metabolic differences.

Conclusion
Our study analyzed the relationship between gut microbiota composition, SCFA concentrations, and obesity in a cohort of African origin from different countries. Ghanaian participants had the most diverse microbiota, and the American cohort had the least. Obese individuals had different gut microbiota composition and function compared to non-obese individuals. Non-obese participants had more SCFA-producing microbes and higher total SCFA concentrations in feces. The predictive accuracy of the microbiota for obesity was greatest in low-income countries, suggesting that lifestyle traits in high-income countries may increase obesity risk even for lean individuals. Alterations in the gut microbiota and associated metabolic functions could guide the development of microbiome-based solutions to treat obesity. Further studies using multi-omic approaches are needed to confirm the identified taxonomic and metabolic signatures.

Methods
Study Cohort. Since 2010, METS and the currently funded METS-Microbiome study have longitudinally followed an international cohort of African origin adults spanning the epidemiologic transition from Ghana, South Africa, Jamaica, Seychelles, and the US (Dugas, Lie, et al. 2018; Luke et al. 2011). METS utilizes the framework of the epidemiologic transition to investigate differences in health outcomes based on country of origin. The epidemiologic transition is defined using the United Nations Human Development Index (HDI) as an approximation. Ghana represents a lower-middle income country, South Africa represents a middle-income country, Jamaica and Seychelles represent high-income countries and the US represents a very high-income country. This framework has allowed us to investigate how aspects of increased Westernization throughout the world (e.g., increased consumption of ultra-processed foods) are related to increased prevalence of obesity, diabetes and cardiometabolic diseases. Our data from the original METS cohort demonstrate that the epidemiologic transition has altered habitual diets in the international METS sites, and that reduced fiber intake is associated with higher metabolic risk, inflammation, and obesity across the epidemiologic transition
Richardson et al. 1989). The LC-MS/MS analysis was completed on an AB Sciex Qtrap 5500 coupled to an Agilent UPLC/HPLC system. All samples were analyzed on an Agilent Poroshell 120 EC-C18 column (100 Å, 2.7 µm, 2.1 mm × 100 mm) coupled to an Agilent UPLC system, which was operated at a flow rate of 400 µl/min. A gradient of buffer A (H2O, 0.1% formic acid) and buffer B (acetonitrile, 0.1% formic acid) was applied as follows: 0 min, 30% of buffer B; increase buffer B to 100% in 4 min; maintain B at 100% for 5 min. The column was then equilibrated for 3 min at 30% B between injections, with MS detection in negative mode.
The MRM transitions of all targeted compounds include the precursor ions and the signature product ions. Unit resolution was used for both analyzers Q1 and Q3. The MS parameters, such as declustering potential, collision energy and collision cell exit potential, were optimized to achieve optimal sensitivity. SCFAs are presented as individual SCFAs (μg/g), including butyric acid, propionic acid, acetic acid and valeric acid, as well as total SCFAs (sum of the 4).
METS data showed Ghanaians consumed the greatest amount of both soluble and insoluble fiber and had the lowest percentage energy from fat (42.5% of the Ghanaian cohort, dietary fiber intake: 24.9 g ± 9.7 g/day). The US has the highest proportion of energy from fat and the lowest fiber intake of the five sites (3.2% of the US cohort, dietary fiber intake: 14.2 g ± 7.1 g/day).
Knight 2005), generated in phyloseq. The Bacteroides/Prevotella ratio was calculated by dividing the abundance of the genus Bacteroides by that of Prevotella. Participants were classified into the Bacteroides enterotype (B-type) if the ratio was greater than 1, and otherwise into the Prevotella enterotype (P-type). For differential abundance analysis, samples were processed to remove exceptionally rare taxa. First, the non-rarefied reads were filtered to remove samples with < 10,000 reads. Next, ASVs with fewer than 50 reads in total across all samples and/or present in less than 2% of samples were excluded. This retained 2061 ASVs across 1694 samples. The retained ASVs were binned at genus level, and subsequently used in the analysis of compositions of microbiomes with bias correction (ANCOM-BC; H. Lin and Peddada 2020) to determine specific taxa differentially abundant across sites or obese phenotype. ANCOM-BC is a statistical approach that accounts for sampling fraction and normalizes the read counts by a process identical to log-ratio transformations while controlling for false discovery rates and increasing power. Site, age, sex, and BMI were added as covariates in the ANCOM-BC formula to reduce the effect of confounders.
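The enterotype rule and the rare-taxa filter described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' phyloseq workflow: the function names, the nested-dictionary table layout, and the handling of a zero Prevotella count (where the ratio is undefined) are assumptions made for the example.

```python
from collections import defaultdict


def classify_enterotype(bacteroides, prevotella):
    """Return 'B-type' if the Bacteroides/Prevotella ratio exceeds 1, else 'P-type'."""
    # Assumption: with zero Prevotella the ratio is undefined, so any
    # Bacteroides at all is treated as B-type; the paper does not say.
    if prevotella == 0:
        return "B-type" if bacteroides > 0 else "P-type"
    return "B-type" if bacteroides / prevotella > 1 else "P-type"


def filter_rare(counts, min_sample_reads=10_000, min_asv_reads=50,
                min_prevalence=0.02):
    """Drop shallow samples, then drop rare ASVs.

    counts maps sample id -> {ASV id -> read count} (hypothetical layout).
    """
    # Step 1: remove samples with fewer than min_sample_reads reads.
    kept = {s: asvs for s, asvs in counts.items()
            if sum(asvs.values()) >= min_sample_reads}

    # Step 2: total reads and number of samples in which each ASV occurs.
    totals = defaultdict(int)
    present = defaultdict(int)
    for asvs in kept.values():
        for asv, n in asvs.items():
            totals[asv] += n
            if n > 0:
                present[asv] += 1

    # Step 3: keep ASVs that pass BOTH the total-read and prevalence cut-offs.
    n_samples = len(kept)
    keep = {a for a in totals
            if totals[a] >= min_asv_reads
            and present[a] >= min_prevalence * n_samples}
    return {s: {a: n for a, n in asvs.items() if a in keep}
            for s, asvs in kept.items()}
```

Applying both cut-offs together follows the "and/or" wording of the exclusion rule: an ASV is removed if it fails either criterion.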
Random forest classifier: Random Forest supervised learning models implemented in QIIME 2 were used to estimate the predictive power of microbial community profiles for site and obese phenotype. The classifications were done with 500 trees based on 10-fold cross-validation using the QIIME 2 "sample-classifier classify-samples" plugin (Bokulich et al. 2018). A randomly drawn 80% of samples were used for model training, whereas the remaining 20% were used for validation. Further, the 30 most important ASVs for differentiating between site or obese phenotype were predicted and annotated.
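To make the evaluation step concrete, the sketch below shows a random 80/20 split and how an AUC can be computed from held-out class scores. It is an illustration in plain Python and does not reproduce the QIIME 2 plugin's internals; the function names and the fixed seed are assumptions for the example, and the classifier itself is left out.

```python
import random


def split_samples(sample_ids, test_fraction=0.2, seed=0):
    """Randomly assign samples to a training set and a held-out validation set."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    ids = list(sample_ids)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Under this definition an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values for country (0.97) and obese state (0.65) should be read.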
Predicted metabolic gene pathway analysis: The functional potential of microbial communities was inferred using the Phylogenetic Investigation of Communities by Reconstruction of Unobserved States 2 (PICRUSt2) v2.5.1 with the ASV table processed to remove exceptionally rare taxa and the representative sequences as input files (Douglas et al. 2020). The metabolic pathways from the PICRUSt2 pipeline were annotated using the MetaCyc database (Caspi et al. 2016). The predicted MetaCyc abundances (unstratified pathway abundances) were analyzed with ANCOM-BC to determine differentially abundant pathway associations across sites and obese status. Site, age, sex, and BMI were added as covariates in the ANCOM-BC formula to reduce the effect of confounders.
Statistical Analysis: All statistical analyses and graphs were done with R software. The Kruskal-Wallis test and the Permutational Analysis of Variance (PERMANOVA) test with 999 permutations, using the adonis function in the vegan package (Oksanen et al. 2013), were performed to compare alpha and beta diversity measures respectively, with correction for multiple group comparisons. PERMANOVA models were adjusted for BMI, age, and sex for country, whereas age, sex and country were accounted for in obese groups. For variables that showed significant differences in the PERMANOVA analyses, a PERMDISP test was performed to assess differences in dispersion or centroids. For differential abundance analysis, the false-discovery rate (FDR) method incorporated in the ANCOM-BC library was used to correct p-values for multiple testing. A cut-off of P adj < 0.05 was used to assess significance. Spearman correlations were performed between concentrations of short chain fatty acids and Shannon diversity, and between concentrations of short chain fatty acids and differentially abundant taxa that were identified either among study sites or between obese and non-obese individuals. The resulting p-values were adjusted for multiple testing using the false-discovery rate (FDR). P value < 0.05 was considered statistically significant. A mixed model was built using the lme4 package to assess whether total SCFAs could be predicted by Shannon diversity, obesity, and country, setting obesity and Shannon diversity as fixed effects with a random intercept for country.
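The correlation and multiple-testing steps can be sketched as follows. The analyses themselves were run in R; this plain-Python version is only an illustration, and it assumes the Benjamini-Hochberg procedure for the FDR adjustment, which the text does not name explicitly. Function names are hypothetical.

```python
def _ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A taxon-SCFA pair would then be called significant when its adjusted p-value falls below 0.05, matching the cut-off stated above.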
Data availability: All 16S rRNA gene sequence data are publicly available via the QIITA platform (https://qiita.ucsd.edu) under the study identifier (ID=13512) and will soon be deposited on the European Bioinformatics Institute (EBI) site. The SILVA 16S rRNA database used for alignment is available at https://data.qiime2.org/2022.2/common/silva-138-99-515-806-nb-classifier.qza. The data and analyses generated in this study are available within the paper, Supplementary Information and Source data files provided with this paper.
Declarations includes data generated at the UC San Diego IGM Genomics Center utilizing an Illumina NovaSeq 6000 that was purchased with funding from a National Institutes of Health SIG grant (#S10 OD026929).
FDR-adjusted p values are denoted as *, ** and ***, corresponding to p < 0.05, < 0.01 and < 0.001, respectively.
